Group 18: FINDING GENE PATTERNS IN BREAST CANCER DATA


Introduction:

Most common cancer in women worldwide

1 in 8 women diagnosed

Many subtypes, so a large collection of data is required

Question: Which genes are differentially expressed in different subtypes of cancer?

General workflow


EXPLORATORY ANALYSIS AND TIDY:

Cleaning procedure


Ratio between male and female

Age of female patients stratified by cancer status

Country of origin of female patients

Histological type of patient samples
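The stratification behind the age plot can be sketched as a simple group-by. This is a minimal illustration, not the group's actual code: the records and the field names (`sex`, `age`, `status`) are hypothetical toy values, and only the two status labels come from the slides.

```python
from collections import defaultdict
from statistics import median

# Toy records (hypothetical values, NOT taken from the dataset).
patients = [
    {"sex": "female", "age": 45, "status": "TUMOR FREE"},
    {"sex": "female", "age": 61, "status": "WITH TUMOR"},
    {"sex": "female", "age": 53, "status": "TUMOR FREE"},
    {"sex": "male", "age": 70, "status": "TUMOR FREE"},
    {"sex": "female", "age": 67, "status": "WITH TUMOR"},
]


def ages_by_status(records, sex="female"):
    """Collect the ages of patients of one sex, grouped by cancer status."""
    groups = defaultdict(list)
    for record in records:
        if record["sex"] == sex:
            groups[record["status"]].append(record["age"])
    return dict(groups)


def median_age_by_status(records, sex="female"):
    """Median age per cancer status for patients of one sex."""
    return {status: median(ages)
            for status, ages in ages_by_status(records, sex).items()}


if __name__ == "__main__":
    for status, med in median_age_by_status(patients).items():
        print(f"{status}: median age {med}")
```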

DESEQ Analysis:

DESEQ workflow


Gene ENSG00000206585

Gene ENSG00000206652

Enriched pathways

Volcano plot
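A volcano plot places each gene at (log2 fold change, -log10 adjusted p-value) and highlights the genes that pass both cutoffs. The sketch below shows that classification step only. The cutoffs (|log2FC| >= 1, adjusted p < 0.05) are common defaults assumed here, not values stated in the slides, and the gene names and numbers are hypothetical.

```python
import math


def classify_genes(results, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Map each gene to (log2FC, -log10(padj), label).

    `results` maps gene -> (log2 fold change, adjusted p-value).
    Labels: "up", "down", or "ns" (not significant).
    The cutoffs are assumed defaults, not taken from the analysis.
    """
    points = {}
    for gene, (log2fc, padj) in results.items():
        # Clamp to avoid log10(0) when a p-value underflows to zero.
        y = -math.log10(max(padj, 1e-300))
        if padj < padj_cutoff and log2fc >= lfc_cutoff:
            label = "up"
        elif padj < padj_cutoff and log2fc <= -lfc_cutoff:
            label = "down"
        else:
            label = "ns"
        points[gene] = (log2fc, y, label)
    return points


# Hypothetical toy results, for illustration only.
toy_results = {
    "gene_a": (2.5, 1e-6),   # large effect, significant
    "gene_b": (-1.8, 1e-3),  # large negative effect, significant
    "gene_c": (0.3, 0.4),    # small effect, not significant
    "gene_d": (3.0, 0.2),    # large effect, but not significant
}

if __name__ == "__main__":
    for gene, (x, y, label) in classify_genes(toy_results).items():
        print(f"{gene}: x={x:.2f}, y={y:.2f}, {label}")
```

Requiring both cutoffs is what gives the plot its shape: genes with a large fold change but weak statistical support, such as `gene_d` here, stay in the unhighlighted lower corners.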

PCA Analysis:

Scree plot and cumulative explained variance from the PCA.

Explained variance

Cumulative variance

Many principal components are needed to explain 85% of the variability in the data, which shows that cancer analysis is a difficult task.
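The number of components behind the 85% figure can be read off the cumulative variance curve. The sketch below assumes the per-component variances (PCA eigenvalues) are already available, sorted in decreasing order. The function names and toy values are hypothetical.

```python
def cumulative_ratios(variances):
    """Cumulative fraction of total variance explained by the first k PCs."""
    total = sum(variances)
    running = 0.0
    ratios = []
    for v in variances:
        running += v
        ratios.append(running / total)
    return ratios


def components_needed(variances, target=0.85):
    """Smallest number of leading PCs whose variance reaches `target`.

    `variances` must be sorted in decreasing order, as PCA returns them.
    """
    total = sum(variances)
    running = 0.0
    for k, v in enumerate(variances, start=1):
        running += v
        if running >= target * total:
            return k
    return len(variances)


# Hypothetical toy eigenvalues, for illustration only.
toy_variances = [4.0, 3.0, 2.0, 1.0]

if __name__ == "__main__":
    print(cumulative_ratios(toy_variances))
    print(components_needed(toy_variances, target=0.85))
```

On real expression data the curve usually rises slowly, so the count returned for an 85% target can be large.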


Different PCs are influenced by distinct sets of genes

Overlapped clustering of patients
  • The highlighted genes for each PC might be linked to specific biological pathways or processes, as they represent the main drivers of variance for the data.
  • The PCA shows significant overlap between “TUMOR FREE” and “WITH TUMOR,” indicating no clear separation of cancer statuses.
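One way to pick out the genes that drive each PC is to rank them by the absolute value of their loadings. This is a minimal sketch under that assumption. The slides do not say how the highlighted genes were chosen, and the loadings and gene names below are hypothetical.

```python
def top_genes(loadings, n=2):
    """Return the n genes with the largest absolute loading for each PC.

    `loadings` maps PC name -> {gene: loading}. The sign is ignored for
    ranking, since strongly negative loadings also drive the component.
    """
    return {
        pc: sorted(weights, key=lambda g: abs(weights[g]), reverse=True)[:n]
        for pc, weights in loadings.items()
    }


# Hypothetical toy loadings, for illustration only.
toy_loadings = {
    "PC1": {"gene_a": 0.8, "gene_b": -0.1, "gene_c": -0.6},
    "PC2": {"gene_a": 0.05, "gene_b": 0.9, "gene_c": 0.4},
}

if __name__ == "__main__":
    for pc, genes in top_genes(toy_loadings).items():
        print(pc, genes)
```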

Discussion: Biological insights

  • The DE genes in the data mainly affect X pathways
  • This does/does not agree with the literature, as 1,2,3

Conclusion: